Saudi Accented Arabic Voice Bank

Authors

  • Mansour Al-Ghamdi
  • Fayez A. Alhargan
  • Mohamed I. Alkanhal
  • Ashraf Alkhairy
  • Munir Eldesouki
  • Ammar Alenazi
Abstract

This paper presents an Arabic speech database that represents native Arabic speakers from all the cities of Saudi Arabia. The database is called the Saudi Accented Arabic Voice Bank (SAAVB). Preparing the prompt sheets, selecting the right speakers, and transcribing their speech were among the challenges the project team faced; the procedures used to address these challenges are highlighted. The database contains recordings of 1033 speakers speaking Modern Standard Arabic with a Saudi accent. The SAAVB content was analyzed and the results are illustrated. The content was verified internally by the project team and externally by IBM Cairo. It can be used to train speech engines such as automatic speech recognition and speaker verification systems.

Similar resources

West Point, SAAVB, and BBN/AUB Arabic Speech Corpora: A Comparative Survey

The aim of this paper is to evaluate three public Arabic speech corpora, namely the West Point (WP) corpus, the Saudi Accented Arabic Voice Bank (SAAVB), and the BBN Technologies/American University at Beirut (BBN/AUB) corpus, using the TIMIT English speech corpus as a benchmark. Weaknesses, strengths, and discrepancies of these Arabic corpora regarding their design and content are covered in this pape...


PROCESSING TIME EFFECTS OF SHORT-TERM EXPOSURE TO FOREIGN-ACCENTED ENGLISH by Constance

Non-native speech can cause perceptual difficulty for the native listener, but experience can moderate this difficulty. This study explored the perceptual benefits of brief exposure to non-native speech. A cross-modal word matching paradigm was used to investigate perception of foreign-accented speech by native English listeners during the first moments of exposure. In 5 experiments, processing...


Using Machine Learning Algorithms for Automatic Cyber Bullying Detection in Arabic Social Media

Social media allows people to interact and express their thoughts or feelings about different subjects. However, some users may write offensive tweets to others via social media, which is known as cyber bullying. Successful prevention depends on automatically detecting malicious messages. Automatic detection of bullying in the text of social media by analyzing the text "twits" via one of the machine l...


The interaction of long-term voice quality with the realisation of focus

Voice quality shifts have been shown to be associated with the realisation of accent, focus and deaccentuation. Mostly, accented and focally accented syllables are reported to exhibit a tenser mode of phonation than the unaccented, but accentuation with a laxer/breathier quality is also reported. Possibly, the long-term voice quality of the speaker (or of the utterance) influences the voice qua...


A Pragmatic Solution to an Indian Accented English Speech Synthesizer Using Residual Excited Linear Predictive Coded Voice

This paper describes a practical solution for an Indian-accented English text-to-speech synthesis system. The paper covers the complete procedure for generating the speech signal of the text in an Indian-accented voice. The technique described considers the various prosodic features that need to be incorporated into the synthesized speech to make it sound natural and like the way an Indian speaks....


Publication date: 2008